Document FP1 
Appl. No. 10/658,688




(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(19) World Intellectual Property Organization 

International Bureau

(43) International Publication Date 
17 January 2002 (17.01.2002) 




PCT 




(10) International Publication Number 

WO 02/04646 A1



(51) International Patent Classification 7: C12N 15/70,
C07K 14/32, A61K 39/07, C12N 15/31, 1/21

(21) International Application Number: PCT/GB01/03065



(22) International Filing Date: 6 July 2001 (06.07.2001)

(25) Filing Language: English

(26) Publication Language: English



(30) Priority Data:
0016702.3 8 July 2000 (08.07.2000) GB



(71) Applicant (for all designated States except US): THE
SECRETARY OF STATE FOR DEFENCE [GB/GB];
DSTL, Porton Down, Salisbury, Wiltshire SP4 0JQ (GB).

(72) Inventors; and

(75) Inventors/Applicants (for US only): WILLIAMSON,
Ethel, Diane [GB/GB]; DSTL, Porton Down, Salisbury,
Wiltshire SP4 0JQ (GB). MILLER, Julie [GB/GB];
DSTL, Porton Down, Salisbury, Wiltshire SP4 0JQ
(GB). WALKER, Nicola, Jane [GB/GB]; DSTL, Porton
Down, Salisbury, Wiltshire SP4 0JQ (GB). BAILLIE,
Leslie, William, James [GB/GB]; DSTL, Porton Down,
Salisbury, Wiltshire SP4 0JQ (GB). HOLDEN, Paula,
Thomson [GB/GB]; DSTL, Porton Down, Salisbury,
Wiltshire SP4 0JQ (GB). FLICK-SMITH, Helen, Claire
[GB/GB]; DSTL, Porton Down, Salisbury, Wiltshire
SP4 0JQ (GB). BULLIFENT, Helen, Lisa [GB/GB];
DSTL, Porton Down, Salisbury, Wiltshire SP4 0JQ (GB).
TITBALL, Richard, William [GB/GB]; DSTL, Porton
Down, Salisbury, Wiltshire SP4 0JQ (GB). TOPPING,
Andrew, William [GB/GB]; 18 Silver Meadows, Barton,
North Yorkshire DL10 6SL (GB).

(74) Agent: GREAVES, Carol, Pauline; Greaves Brewster,
24A Woodborough Road, Winscombe, North Somerset
BS25 1AD (GB).

(81) Designated States (national): AE, AG, AL, AM, AT, AU,
AZ, BA, BB, BG, BR, BY, BZ, CA, CH, CN, CO, CR, CU,
CZ, DE, DK, DM, DZ, EC, EE, ES, FI, GB, GD, GE, GH,
GM, HR, HU, ID, IL, IN, IS, JP, KE, KG, KR, KZ, LC, LK,
LR, LS, LT, LU, LV, MA, MD, MG, MK, MN, MW, MX,
MZ, NO, NZ, PL, PT, RO, RU, SD, SE, SG, SI, SK, SL,
TJ, TM, TR, TT, TZ, UA, UG, US, UZ, VN, YU, ZA, ZW.

(84) Designated States (regional): ARIPO patent (GH, GM,
KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZW), Eurasian
patent (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM), European
patent (AT, BE, CH, CY, DE, DK, ES, FI, FR, GB, GR, IE,
IT, LU, MC, NL, PT, SE, TR), OAPI patent (BF, BJ, CF,
CG, CI, CM, GA, GN, GW, ML, MR, NE, SN, TD, TG).

Published: 

— with international search report 

— before the expiration of the time limit for amending the 
claims and to be republished in the event of receipt of 
amendments 

For two-letter codes and other abbreviations, refer to the
"Guidance Notes on Codes and Abbreviations" appearing at the
beginning of each regular issue of the PCT Gazette.

(54) Title: EXPRESSION SYSTEM

(57) Abstract: An immunogenic reagent which produces an immune response which is protective against Bacillus anthracis, said
reagent comprising one or more polypeptides which together represent up to three domains of the full length Protective Antigen
(PA) of B. anthracis or variants of these, and at least one of said domains comprises domain 1 or domain 4 of PA or a variant
thereof. The polypeptides of the immunogenic reagent as well as full length PA are produced by expression from E. coli. High yields
of polypeptide are obtained using this method. Cells, vectors and nucleic acids used in the method are also described and claimed.






Expression System 

The present invention relates to polypeptides which produce an
immune response which is protective against infection by
Bacillus anthracis, to methods of producing these, to
recombinant Escherichia coli cells useful in the methods, and
to nucleic acids and transformation vectors used.

Present systems for expressing PA for vaccine systems use
protease-deficient Bacillus subtilis as the expression host.
Although such systems are acceptable in terms of product
quantity and purity, there are significant drawbacks. Firstly,
regulatory authorities are generally unfamiliar with this host,
and licensing decisions may be delayed as a result. More
importantly, the currently used strains of Bacillus subtilis
produce thermostable spores which require the use of a dedicated
production plant.

WO 00/02522 describes in particular VEE virus replicons which
express PA or certain immunogenic fragments.

E. coli is well known as an expression system for a range of
human vaccines. While the ability to readily ferment E. coli to
very high cellular densities makes this bacterium an ideal host
for the expression of many proteins, previous attempts to
express and purify recombinant PA from E. coli cytosol have been
hindered by low protein yields and proteolytic degradation
(Singh et al., J. Biol. Chem. (1989) 264; 11099-11102, Vodkin et
al., Cell (1993) 34; 693-697 and Sharma et al., Protein Expr.
Purif. (1996), 7, 33-38).

A strategy for overexpressing PA as a stable, soluble protein in
the E. coli cytosol has been described recently (Willhite et
al., Protein and Peptide Letters, (1998), 5; 273-278). The
strategy adopted is one of adding an affinity tag sequence to
the N terminus of PA, which allows a simple purification system.
A problem with this system is that it requires a further
downstream processing step in order to remove the tag before the
PA can be used.

Codon optimisation is a technique which is now well known and
used in the design of synthetic genes. There is a degree of
redundancy in the genetic code, in so far as most amino acids
are coded for by more than one codon sequence. Different
organisms utilise one or other of these different codons
preferentially. By optimising codons, it is generally expected
that expression levels of the particular protein will be
enhanced.
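As a minimal illustration of the idea, the sketch below back-translates a short peptide using a single preferred codon per amino acid. The codon table is an illustrative assumption (commonly cited high-frequency E. coli codons for a handful of residues) and is not the data of Figure 1; the function names are hypothetical. The peptide used is the N-terminal sequence reported in Example 4.

```python
# Illustrative only: one assumed "preferred" E. coli codon per amino acid.
# This table is NOT taken from Figure 1; it covers just the residues used below.
PREFERRED_CODON = {
    "M": "ATG", "E": "GAA", "V": "GTG", "K": "AAA",
    "Q": "CAG", "N": "AAC", "R": "CGT", "L": "CTG",
}


def back_translate(peptide: str) -> str:
    """Return a DNA sequence encoding `peptide` using the assumed codon table."""
    return "".join(PREFERRED_CODON[aa] for aa in peptide)


def translate(dna: str) -> str:
    """Translate DNA back to protein using the inverse of the same table."""
    codon_to_aa = {codon: aa for aa, codon in PREFERRED_CODON.items()}
    return "".join(codon_to_aa[dna[i:i + 3]] for i in range(0, len(dna), 3))


peptide = "MEVKQENRLL"           # N-terminal sequence from Example 4
dna = back_translate(peptide)    # synthetic coding sequence, 3 bases per residue
```

A real design would choose among synonymous codons using a full usage table and would also check for unwanted restriction sites or secondary structure; none of that is attempted here.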

This is generally desirable, except where, as in the case of PA,
higher expression levels will result in proteolytic degradation
and/or cell toxicity. In such cases, elevating expression
levels might be counter-productive and result in significant
cell toxicity.

Surprisingly however, the applicants have found that this is not
the case in E. coli and that in this system, codon optimisation
results in expression of unexpectedly high levels of recombinant
PA, irrespective of the presence or absence of proteolytic
enzymes within the strain.

Furthermore, it would appear that expression of a protective
domain of PA does not inhibit expression in E. coli.

The crystal structure of native PA has been elucidated (Petosa
C., et al. Nature 385: 833-838, 1997) and shows that PA consists
of four distinct and functionally independent domains: domain 1,
divided into 1a, 1-167 amino acids and 1b, 168-258 amino acids;
domain 2, 259-487 amino acids; domain 3, 488-595 amino acids and
domain 4, 596-735 amino acids.
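The domain boundaries listed above can be captured in a small lookup table. The following is a sketch only: the residue ranges come from the paragraph above, while the names `PA_DOMAINS`, `domain_length`, `extract_domain` and `domain_of` are hypothetical helpers introduced for illustration.

```python
# Domain boundaries of PA as listed above (1-based, inclusive residue numbers).
PA_DOMAINS = {
    "1a": (1, 167),
    "1b": (168, 258),
    "2": (259, 487),
    "3": (488, 595),
    "4": (596, 735),
}


def domain_length(name: str) -> int:
    """Number of residues in the named domain."""
    start, end = PA_DOMAINS[name]
    return end - start + 1


def extract_domain(sequence: str, name: str) -> str:
    """Slice a full-length PA amino acid string down to one domain."""
    start, end = PA_DOMAINS[name]
    return sequence[start - 1:end]  # convert 1-based inclusive to Python slice


def domain_of(residue: int) -> str:
    """Return the domain label that contains a given residue number."""
    for name, (start, end) in PA_DOMAINS.items():
        if start <= residue <= end:
            return name
    raise ValueError(f"residue {residue} is outside 1-735")
```

With these ranges the five segments tile residues 1 to 735 without gaps or overlaps, which is a quick consistency check on the numbering.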

The applicants have identified that certain domains appear to
produce surprisingly good protective effects when used in
isolation, in fusion proteins or in combination with each other.

According to the present invention there is provided an
immunogenic reagent which produces an immune response which is
protective against Bacillus anthracis, said reagent comprising
one or more polypeptides which together represent up to three
domains of the full length Protective Antigen (PA) of B.
anthracis or variants of these, and at least one of said domains
comprises domain 1 or domain 4 of PA or a variant thereof.

Specifically, the reagent will comprise mixtures of polypeptides
or fusion peptides wherein individual polypeptides comprise one
or more individual domains of PA.

In particular, the reagent comprises polypeptide(s) comprising
domain 1 or domain 4 of PA or a variant thereof, in a form other
than full length PA. Where present, domains are suitably
complete, in particular domain 1 is present in its entirety.

The term "polypeptide" used herein includes proteins and
peptides.

As used herein, the expression "variant" refers to sequences of
amino acids which differ from the basic sequence in that one or
more amino acids within the sequence are deleted or substituted
for other amino acids, but which still produce an immune
response which is protective against Bacillus anthracis. Amino
acid substitutions may be regarded as "conservative" where an
amino acid is replaced with a different amino acid with broadly
similar properties. Non-conservative substitutions are where
amino acids are replaced with amino acids of a different type.
Broadly speaking, fewer non-conservative substitutions will be
possible without altering the biological activity of the
polypeptide. Suitably variants will be at least 60% identical,
preferably at least 75% identical, and more preferably at least
90% identical to the PA sequence.

In particular, the identity of a particular variant sequence to
the PA sequence may be assessed using the multiple alignment
method described by Lipman and Pearson, (Lipman, D.J. & Pearson,
W.R. (1985) Rapid and Sensitive Protein Similarity Searches,
Science, vol 227, pp 1435-1441). The "optimised" percentage score
should be calculated with the following parameters for the
Lipman-Pearson algorithm: ktup = 1, gap penalty = 4 and gap penalty
length = 12. The sequences for which similarity is to be
assessed should be used as the "test sequence" which means that
the base sequence for the comparison, (SEQ ID NO 1), should be
entered first into the algorithm.
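To make the identity thresholds concrete, the sketch below computes a simple percentage identity between two sequences that have already been aligned to equal length. This is not the Lipman-Pearson algorithm and does not use its ktup or gap parameters; it is a simplified stand-in, and the convention of skipping gap columns, as well as the function names, are assumptions made for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical residues over non-gap columns of an alignment.

    Both inputs must already be aligned (same length, '-' for gaps).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # assumed convention: gap columns are not scored
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def identity_tier(pct: float) -> str:
    """Map a percentage onto the tiers stated in the text (60%, 75%, 90%)."""
    if pct >= 90:
        return "at least 90% (more preferred)"
    if pct >= 75:
        return "at least 75% (preferred)"
    if pct >= 60:
        return "at least 60% (suitable)"
    return "below 60%"
```

For example, a ten-residue peptide with a single substitution scores 90% under this simplified measure.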

Preferably, the reagent of the invention includes a polypeptide
which has the sequence of domain 1 and/or domain 4 of wild-type
PA.

A particularly preferred embodiment of the invention comprises
domain 4 of the PA of B. anthracis.

These domains comprise the sequences shown in Table 1 below.



Table 1

Domain    Amino acids of full-length PA*
4         596-735
1         1-258

* These amino acid numbers refer to the sequence as shown in
Welkos et al. Gene 69 (1988) 287-300 and are illustrated
hereinafter as SEQ ID NOs 15 (Fig 4) and 3 (Fig 3) respectively.

Domain 1 comprises two regions, designated 1a and 1b. Region 1a
comprises amino acids 1-167 whereas region 1b is from amino acid
168-258. It appears that region 1a is important for the
production of a good protective response, and the full domain
may be preferred.

In a particularly preferred embodiment, a combination of domains
1 and 4, or protective regions thereof, is used as the
immunogenic reagent which gives rise to an immune response
protective against B. anthracis. This combination, for example
as a fusion peptide, may be expressed using the expression
system of the invention as outlined hereinafter.


When domain 1 is employed, it is suitably fused to domain 2 of 
the PA sequence, and may preferably be fused to domain 2 and 
domain 3. 

Such combinations and their use in prophylaxis or therapy form
a further aspect of the invention.

Suitably the domains described above are part of a fusion
protein, preferably with an N-terminal glutathione-S-transferase
protein (GST). The GST not only assists in the purification of
the protein, it may also provide an adjuvant effect, possibly as
a result of increasing the size.

The polypeptides of the invention are suitably prepared by
conventional methods. For example, they may be synthesised or
they may be prepared using recombinant DNA technology. In
particular, nucleic acids which encode said domains are included
in an expression vector, which is used to transform a host cell.
Culture of the host cell followed by isolation of the desired
polypeptide can then be carried out using conventional methods.
Nucleic acids, vectors and transformed cells used in these
methods form a further aspect of the invention.

Generally speaking, the host cells used will be those that are
conventionally used in the preparation of PA, such as Bacillus
subtilis.

The applicants have found surprisingly that the domains, either
in isolation or in combination, may be successfully expressed in
E. coli under certain conditions.

Thus, the present invention further provides a method for
producing an immunogenic polypeptide which produces an immune
response which is protective against B. anthracis, said method
comprising transforming an E. coli host with a nucleic acid
which encodes either (a) the protective antigen (PA) of Bacillus
anthracis or a variant thereof which can produce a protective
immune response, or (b) a polypeptide comprising at least one
protective domain of the protective antigen (PA) of Bacillus
anthracis or a variant thereof which can produce a protective
immune response as described above, culturing the transformed
host and recovering the polypeptide therefrom, provided that
where the polypeptide is the protective antigen (PA) of Bacillus
anthracis or a variant thereof which can produce a protective
immune response, the percentage of guanine and cytosine
residues within the said nucleic acid is in excess of 35%.


Using these options, high yields of product can be obtained 
using a favoured expression host. 

A table showing codons and the frequency with which they appear
in the genomes of Escherichia coli and Bacillus anthracis is
shown in Figure 1. It is clear that guanine and cytosine
appear much more frequently in E. coli than in B. anthracis.
Analysis of the codon usage content reveals the following:



Species        1st letter     2nd letter     3rd letter     Total GC
               of Codon GC    of Codon GC    of Codon GC    content
E. coli        58.50%         40.70%         54.90%         51.37%
B. anthracis   44.51%         31.07%         25.20%         33.59%
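The tabulated totals appear to be the arithmetic mean of the three codon-position values. The source does not state this explicitly, so the short check below is offered only as an observation on the figures; the variable names are hypothetical.

```python
# GC percentages by codon position, copied from the table above.
gc_by_position = {
    "E. coli": (58.50, 40.70, 54.90),
    "B. anthracis": (44.51, 31.07, 25.20),
}

# Assumption being tested: total GC content = mean of the three positions.
mean_gc = {
    species: round(sum(values) / len(values), 2)
    for species, values in gc_by_position.items()
}

# Difference in overall GC content between the two organisms (percentage points).
gc_gap = round(mean_gc["E. coli"] - mean_gc["B. anthracis"], 2)
```

The computed means reproduce the stated totals of 51.37% and 33.59%, and the largest gap between the two organisms is at the third codon position, where synonymous changes are most often possible.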




Thus it would appear that codons which are favoured by E. coli
are those which include guanine or cytosine where possible.
By increasing the percentage of guanine and cytosine
nucleotides in the sequence used to encode the immunogenic
protein over that normally found in the wild-type B. anthracis
gene, the codon usage will be such that expression in E. coli is
improved.
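The GC percentage of a candidate coding sequence, overall and at each codon position, can be computed directly. The following is a minimal sketch with hypothetical function names; the 35% default threshold is the figure stated in the text, and the input is assumed to be an in-frame coding sequence containing only A, C, G and T.

```python
def gc_percent(dna: str) -> float:
    """Percentage of G and C bases in a DNA string."""
    dna = dna.upper()
    if not dna:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in dna) / len(dna)


def gc_by_codon_position(coding: str) -> tuple:
    """GC percentage at the 1st, 2nd and 3rd positions of each codon."""
    coding = coding.upper()
    if not coding or len(coding) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    # coding[i::3] collects every base that sits at codon position i
    return tuple(gc_percent(coding[i::3]) for i in range(3))


def exceeds_gc_threshold(dna: str, minimum: float = 35.0) -> bool:
    """True if the GC content is in excess of `minimum` percent."""
    return gc_percent(dna) > minimum
```

Applied to a wild-type gene and to a codon-optimised version of it, these functions would show whether the optimised sequence falls within the preferred GC ranges described in the text.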

Suitably the percentage of guanine and cytosine residues
within the coding nucleic acid used in the invention, at least
where the polypeptide is the protective antigen (PA) of Bacillus
anthracis or a variant thereof which can produce a protective
immune response, is in excess of 40%, preferably in excess of
45% and most preferably from 50-52%.

High levels of expression of protective domains can be achieved
using the wild-type B. anthracis sequence encoding these
units. However, the yields may be improved further by
increasing the GC% of the nucleic acid as described above.

In a particular embodiment, the method involves the expression
of PA of B. anthracis.

Further according to the present invention, there is provided a
recombinant Escherichia coli cell which has been transformed
with a nucleic acid which encodes the protective antigen (PA) of
Bacillus anthracis or a variant thereof which can produce a
protective immune response, and wherein the percentage of
guanine and cytosine residues within the nucleic acid is in
excess of 35%.

As before, suitably the percentage of guanine and cytosine
residues within the coding nucleic acid is in excess of 40%,
preferably in excess of 45% and most preferably from 50-52%.

Suitably, the nucleic acid used to transform the E. coli cells
of the invention is a synthetic gene. In particular, the
nucleic acid is of SEQ ID NO 1 as shown in Figure 2 or a
modified form thereof.

The expression "modified form" refers to other nucleic acid
sequences which encode PA or fragments or variants thereof which
produce a protective immune response but which utilise some
different codons, provided the requirement for the percentage GC
content in accordance with the invention is met. Suitable
modified forms will be at least 80% similar, preferably 90%
similar and most preferably at least 95% similar to SEQ ID NO 1.
In particular, the nucleic acid comprises SEQ ID NO 1.

In an alternative embodiment, the invention provides a
recombinant Escherichia coli cell which has been transformed
with a nucleic acid which encodes a protective domain of the
protective antigen (PA) of Bacillus anthracis or a variant
thereof which can produce a protective immune response.

Preferably, the nucleic acid encodes domain 1 or domain 4 of
the PA of B. anthracis.

Further according to the invention there is provided a method of
producing an immunogenic polypeptide which produces an immune
response which is protective against B. anthracis, said method
comprising culturing a cell as described above and recovering
the desired polypeptide from the culture. Such methods are well
known in the art.

In yet a further aspect, the invention provides an E. coli
transformation vector comprising a nucleic acid which encodes
the protective antigen (PA) of Bacillus anthracis or a variant
thereof which can produce a protective immune response, and
wherein the percentage of guanine and cytosine residues within
the nucleic acid is in excess of 35%.

A still further aspect of the invention comprises an E. coli
transformation vector comprising a nucleic acid which encodes a
protective domain of the protective antigen (PA) of Bacillus
anthracis or a variant thereof which can produce a protective
immune response.

Suitable vectors for use in the transformation of E. coli are
well known in the art. For example, the T7 expression system
provides good expression levels. However a particularly
preferred vector comprises pAG163 obtainable from Avecia (UK).

A nucleic acid of SEQ ID NO 1 or a variant thereof which encodes
PA and which has at least 35%, preferably at least 40%, more
preferably at least 45% and most preferably from 50-52% GC
content forms a further aspect of the invention.

If desired, PA, or the variants or domains, can be expressed as a
fusion to another protein, for example a protein which provides
a different immunity, a protein which will assist in
purification of the product or a highly expressed protein (e.g.
thioredoxin, GST) to ensure good initiation of translation.

Optionally, additional systems such as T7 lysozyme may be added
to the expression system to improve the repression of the
system, although, in the case of the invention, the problems
associated with cell toxicity have not been noted.

Any suitable E. coli strain can be employed in the process of
the invention. Strains which are deficient in a number of
proteases (e.g. lon-, ompT-) are available, which would be
expected to minimise proteolysis. However, the applicants have
found that there is no need to use such strains to achieve good
yields of product and that other known strains such as K12
produce surprisingly high product yields.

Fermentation of the strain is generally carried out under
conventional conditions as would be understood in the art. For
example, fermentations can be carried out as batch cultures,
preferably in large shake flasks, using a complex medium
containing antibiotics for plasmid maintenance and with addition
of IPTG for induction.

Suitably cultures are harvested and cells stored at -20°C until 
required for purification. 

Suitable purification schemes for E. coli PA (or variant or
domain) expression can be adapted from those used in B. subtilis
expression. The individual purification steps to be used will
depend on the physical characteristics of recombinant PA.
Typically an ion exchange chromatography separation is carried
out under conditions which allow greatest differential binding
to the column followed by collection of fractions from a shallow
gradient. In some cases, a single chromatographic step may be
sufficient to obtain product of the desired specification.

Fractions can be analysed for the presence of the product using
SDS PAGE or Western blotting as required.

As illustrated hereinafter, the successful cloning and
expression of a panel of fusion proteins representing intact or
partial domains of rPA has been achieved. The immunogenicity and
protective efficacy of these fusion proteins against STI spore
challenge has been assessed in the A/J mouse model.

All the rPA domain proteins were immunogenic in A/J mice and
conferred at least partial protection against challenge compared
to the GST control immunised mice. The carrier protein, GST
attached to the N-terminus of the domain proteins, did not
impair the immunogenicity of the fusion proteins either in vivo,
shown by the antibody response stimulated in immunised animals,
or in vitro as the fusion proteins could be detected with
anti-rPA antisera after Western blotting, indicating that the GST
tag did not interfere with rPA epitope recognition. Immunisation
with the larger fusion proteins produced the highest titres. In
particular, mice immunised with the full length GST 1-4 fusion
protein produced a mean serum anti-rPA concentration
approximately eight times that of the rPA immunised group
(Figure 5). Immunisation of mice with rPA domains 1-4 with the
GST cleaved off produced titres of approximately one half those
produced by immunisation with the fusion protein. Why this
fusion protein should be much more immunogenic is unclear. It is
possible that the increased size of this protein may have an
adjuvantising effect on the immune effector cells. It did not
stimulate this response to the same extent in the other fusion
proteins and any adjuvantising effect of the GST tag did not
enhance protection against challenge as the cleaved proteins
were similarly protective to their fusion protein counterparts.

Despite having good anti-rPA titres, some breakthrough in
protection at the lower challenge level of 10^2 MLDs occurred in
the groups immunised with GST1, cleaved 1, GST1b-2, GST1b-3 and
GST1-3 and immunisation with these proteins did not prolong the
survival time of those mice that did succumb to challenge,
compared with the GST control immunised mice. This suggests
that the immune response had not been appropriately primed by
these proteins to achieve full resistance to the infection. As
has been shown in other studies in mice and guinea pigs (Little
S.F. et al. 1986. Infect. Immun. 52: 509-512, Turnbull P.C.B.,
et al., 1986. Infect. Immun. 52: 356-363) there is no precise
correlation between antibody titre to PA and protection against
challenge. However a certain threshold of antibody is required
for protection (Cohen S. et al., 2000 Infect. Immun. 68:
4549-4558), suggesting that cell mediated components of the immune
response are also required to be stimulated for protection
(Williamson 1989).

GST1, GST1b-2 and GST1-2 were the least stable fusion proteins
produced, as shown by SDS-PAGE and Western blotting results,
possibly due to the proteins being more susceptible to
degradation in the absence of domain 3, and this instability may
have resulted in the loss of protective epitopes.

The structural conformation of the proteins may also be
important for stimulating a protective immune response. The
removal of domain 1a from the fusion proteins gave both reduced
antibody titres and less protection against challenge, when
compared to their intact counterparts GST1-2 and GST1-3.
Similarly, mice immunised with GST 1 alone were partially
protected against challenge, but when combined with domain 2, as
the GST1-2 fusion protein, full protection was seen at the 10^2
MLD challenge level. However the immune response stimulated by
immunisation with the GST1-2 fusion protein was insufficient to
provide full protection against the higher 10^3 MLDs challenge
level, which again could be due to the loss of protective
epitopes due to degradation of the protein.

All groups immunised with truncates containing domain 4,
including GST 4 alone, cleaved 4 alone and a mixture of two
individually expressed domains, GST 1 and GST 4, were fully
protected against challenge with 10^3 MLDs of STI spores (Table
1). Brossier et al showed a decrease in protection in mice
immunised with a mutated strain of B. anthracis that expressed PA
without domain 4 (Brossier F., et al. 2000. Infect. Immun. 68:
1781-1786) and this was confirmed in this study, where
immunisation with GST 1-3 resulted in breakthrough in protection
despite good antibody titres. These data indicate that domain 4
is the immunodominant sub-unit of PA. Domain 4 represents the
139 amino acids of the carboxy terminus of the PA polypeptide.
It contains the host cell receptor binding region (Little S.F.
et al., 1996 Microbiology 142: 707-715), identified as being in
and near a small loop located between amino acid residues 679-693
(Varughese M., et al. 1999 Infect. Immun. 67: 1860-1865).




Therefore it is essential for host cell intoxication as it has
been demonstrated that forms of PA expressed containing
mutations (Varughese 1999 supra.) or deletions (Brossier 2000
supra.) in the region of domain 4 are non-toxic. The crystal
structure of PA shows domain 4, and in particular a 19 amino
acid loop of the domain (703-722), to be more exposed than the
other three domains which are closely associated with each other
(Petosa 1997 supra.). This structural arrangement may make
domain 4 the most prominent epitope for recognition by immune
effector cells, and therefore fusion proteins containing domain
4 would elicit the most protective immune response.

This investigation has further elucidated the role of PA in the
stimulation of a protective immune response, demonstrating that
protection against anthrax infection can be attributed to
individual domains of PA.

The invention will now be particularly described by way of 
example, with reference to the accompanying drawings in which: 

Figure 1 is a Table of codon frequencies found within E. coli
and B. anthracis;

Figure 2 shows the sequence of a nucleic acid according to the
invention, which encodes PA of B. anthracis, as published by
Welkos et al supra;

Figure 3 shows SEQ ID NOs 3-14, which are amino acid and DNA
sequences used to encode various domains or combinations of
domains of PA as detailed hereinafter;

Figure 4 shows SEQ ID NOs 15-16 which are the amino acid and DNA
sequences of domain 4 of PA respectively; and



Figure 5 is a table showing anti-rPA IgG concentration, 37
days post primary immunisation, from A/J mice immunised
intramuscularly on days 1 and 28 with 10 µg of fusion protein
including the PA fragment; results shown are mean ± sem of samples
taken from 5 mice per treatment group.

Example 1

Investigation into expression in E. coli
The rPA expression plasmid pAG163::rPA was modified to
substitute a KmR marker for the original TcR gene. This plasmid was
transformed into the expression host E. coli BLR (DE3) and the
expression level and solubility were assessed. This strain is
deficient in the intracellular protease La (lon gene product)
and the outer membrane protease OmpT.

Expression studies did not however show any improvement in the
accumulation of soluble protein in this strain compared to lon+
K12 host strains (i.e. accumulation is prevented due to
excessive proteolysis). It was concluded that any intracellular
proteolysis of rPA was not due to the action of La protease.

Example 2

Fermentation analysis

Further analysis was carried out on the fermentation performed
using the K12 strain UT5600 (DE3) pAG163::rPA.

It was found that the rPA in this culture was divided between
the soluble and insoluble fractions (estimated 350 mg/L
insoluble, 650 mg/L full length soluble). The conditions used
(37°C, 1 mM IPTG for induction) had not yielded any detectable
soluble rPA in shake flask cultures and given the results
described in Example 1 above, the presence of a large amount of
soluble rPA is surprising. Nevertheless it appears that
manipulation of the fermentation, induction and point of harvest
may allow stable accumulation of rPA in E. coli K12 expression
strains.
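The split between fractions can be put in proportion with a short calculation. This is simple arithmetic on the two estimates quoted above; the variable names are hypothetical.

```python
# Estimated rPA yields from the Example 2 fermentation, in mg per litre.
insoluble_mg_per_l = 350
soluble_mg_per_l = 650

# Total rPA and the share recovered as full length soluble protein.
total_mg_per_l = insoluble_mg_per_l + soluble_mg_per_l
soluble_fraction = soluble_mg_per_l / total_mg_per_l
```

On these estimates roughly two thirds of the expressed rPA (65% of about 1 g/L) was soluble.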

Example 3

A sample of rPA was produced from material initially isolated as
insoluble inclusion bodies from the UT5600 (DE3) pAG163::rPA
fermentation. Inclusion bodies were washed twice with 25 mM
Tris-HCl pH 8 and once with the same buffer + 2 M urea. They were then
solubilized in buffer + 8 M urea and debris pelleted. Urea was
removed by dilution into 25 mM Tris-HCl pH 8 and static incubation
overnight at 4°C. The diluted sample was applied to a Q sepharose
column and protein eluted with a NaCl gradient. Fractions
containing the highest purity rPA were pooled, aliquoted and frozen
at -70°C. Testing of this sample using a 4-12% MES-SDS NuPAGE gel
against a known standard indicated that it is of high purity and
low in endotoxin contamination.

Example 4

Further Characterisation of the Product

N terminal sequencing of the product showed that the N-terminal
sequence consisted of

MEVKQENRLL (SEQ ID NO 2)

This confirmed that the product was as expected with initiator
methionine left on.

The material was found to react in Western blot; MALDI-MS on
the sample indicated a mass of approx 82 700 (compared to
expected mass of 82 915). Given the high molecular mass and
distance from mass standard used (66 kDa), this is considered an
indication that material does not have significant truncation
but does not rule out microheterogeneity within the sample.
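The size of the discrepancy can be quantified as follows. The two masses are taken from the paragraph above; the average residue mass of about 110 Da is a common rule of thumb and is an assumption, not a figure from the source.

```python
# Masses in daltons, as reported above.
expected_mass = 82915
observed_mass = 82700

# Absolute and relative difference between expected and observed mass.
mass_difference = expected_mass - observed_mass
relative_difference_pct = 100.0 * mass_difference / expected_mass

# Assumption: roughly 110 Da per amino acid residue (rule of thumb only).
AVERAGE_RESIDUE_MASS = 110.0
approx_residues = mass_difference / AVERAGE_RESIDUE_MASS
```

A difference of about 0.26%, equivalent to roughly two residues under the stated assumption, is small relative to a protein of 735 residues, which is in keeping with the conclusion that there is no significant truncation.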

Example 5

Testing of Individual domains of PA

Individual domains of PA were produced as recombinant proteins
in E. coli as fusion proteins with the carrier protein
glutathione-S-transferase (GST), using the Pharmacia pGEX-6P-3
expression system. The sequences of the various domains and
the DNA sequence used to encode them are attached herewith as
Figure 3. The respective amino acid and DNA sequences are
provided in Table 2 below.

These fusion proteins were used to immunise A/J mice (Harlan
Olac) intra-muscularly with 10 µg of the respective fusion
protein adsorbed to 20% v/v alhydrogel in a total volume of
100 µl.

Animals were immunised on two occasions and their development of
protective immunity was determined by challenge with spores of
B. anthracis (STI strain) at the indicated dose levels. The table
below shows survivors at 14 days post-challenge.

                                        Challenge level in spores/mouse

Domains       Amino acid   DNA         5x10^4      9x10^4      9x10^5   1x10^6   5x10^6
              SEQ ID NO    SEQ ID NO
GST-1         3            4           4/4         3/5
GST-1+2       5            6           4/4; 5/5    4/5; 5/5
GST-1b+2      7            8           2/5         1/5
GST-1b+2+3    9            10          2/5         3/5
GST-1+2+3     11           12          Nd          4/5         3/5
GST-1+2+3+4   13           14          Nd          5/5         5/5
1+2+3+4       13           14          Nd          Nd                   5/5      5/5



The data show that a combination of all 4 domains of PA,
whether presented as a fusion protein with GST or not, was
protective up to a high challenge level. Removal of domain 4,
leaving 1+2+3, resulted in breakthrough at the highest challenge
level tested, 9x10^5. Domains 1+2 were as protective as a
combination of domains 1+2+3 at 9x10^4 spores. However, removal
of domain 1a to leave a GST fusion with domains 1b+2 resulted
in breakthrough in protection at the highest challenge level
tested (9x10^4), which was only slightly improved by adding domain
3.

The data indicate that the protective immunity induced by PA
can be attributed to individual domains (intact domain 1 and
domain 4) or to combinations of domains taken as permutations
from all 4 domains.

The amino acid sequence and a DNA coding sequence for domain 4
are shown in Figure 4 as SEQ ID NOs 15 and 16 respectively.

Example 6 

Further Testing of domains as vaccines 

DNA encoding the PA domains, amino acids 1-259, 168-488, 1-488,
168-596, 1-596, 260-735, 489-735, 597-735 and 1-735 (truncates
GST1, GST1b-2, GST1-2, GST1b-3, GST1-3, GST2-4, GST3-4, GST4 and
GST1-4 respectively), was PCR amplified from B. anthracis Sterne
DNA and cloned into the XhoI/BamHI sites of the expression
vector pGEX-6-P3 (Amersham-Pharmacia) downstream of and in frame with
the lac promoter. Proteins produced using this system were
expressed as fusion proteins with an N-terminal
glutathione-S-transferase protein (GST). Recombinant plasmid DNA harbouring
the DNA encoding the PA domains was then transformed into E.
coli BL21 for protein expression studies.

E. coli BL21 harbouring recombinant pGEX-6-P3 plasmids were
cultured in L-broth containing 50µg/ml ampicillin, 30µg/ml
chloramphenicol and 1% w/v glucose. Cultures were incubated
with shaking (170 rev min^-1) at 30°C to an A600nm of 0.4, prior to
induction with 0.5mM IPTG. Cultures were incubated for a
further 4 hours, followed by harvesting by centrifugation at
10 000 rpm for 15 minutes.

Initial extraction of the PA truncate-fusion proteins indicated
that they were produced as inclusion bodies. Cell pellets were
resuspended in phosphate buffered saline (PBS) and sonicated for
4x20 seconds in an iced water bath. The suspension was
centrifuged at 15 000 rpm for 15 minutes and cell pellets were
then urea extracted, by suspension in 8M urea with stirring at
room temperature for 1 hour. The suspension was centrifuged for
15 minutes at 15 000 rpm and the supernatant dialysed against
100mM Tris pH 8 containing 400mM L-arginine and 0.1mM EDTA,
prior to dialysis into PBS.

The successful refolding of the PA truncate-fusion proteins
allowed them to be purified on a glutathione Sepharose CL-4B
affinity column. All extracts (with the exception of truncate
GST1b-2, amino acid residues 168-487) were applied to a 15 ml
glutathione Sepharose CL-4B column (Amersham-Pharmacia),
previously equilibrated with PBS, and incubated, with rolling,
overnight at 4°C. The column was washed with PBS and the fusion
protein eluted with 50mM Tris pH 7, containing 150mM NaCl, 1mM
EDTA and 20mM reduced glutathione. Fractions containing the PA
truncates, identified by SDS-PAGE analysis, were pooled and
dialysed against PBS. Protein concentration was determined
using BCA (Perbio).

However, truncate GST1b-2 could not be eluted from the
glutathione Sepharose CL-4B affinity column using reduced
glutathione and was therefore purified using ion exchange
chromatography. Specifically, truncate GST1b-2 was dialysed
against 20mM Tris pH 8, prior to loading onto a HiTrap Q column
(Amersham-Pharmacia), equilibrated with the same buffer. Fusion
protein was eluted with an increasing NaCl gradient of 0-1M in
20mM Tris pH 8. Fractions containing the GST-protein were
pooled, concentrated and loaded onto a HiLoad 26/60 Superdex 200
gel filtration column (Amersham-Pharmacia), previously
equilibrated with PBS. Fractions containing fusion protein were
pooled and the protein concentration determined by BCA (Perbio).
Yields were between 1 and 43mg per litre of culture.

The molecular weight of the fragments and their recognition by
antibodies to PA were confirmed using SDS-PAGE and Western
blotting. Analysis of the rPA truncates by SDS-PAGE and Western
blotting showed protein bands of the expected sizes. Some
degradation in all of the rPA truncates investigated was
apparent, showing similarity with recombinant PA expressed in B.
subtilis. The rPA truncates GST1, GST1b-2 and GST1-2 were
particularly susceptible to degradation in the absence of domain
3. This has similarly been reported for rPA constructs
containing mutations in domain 3, which could not be purified
from B. anthracis culture supernatants (Brossier 1999),
indicating that domain 3 may stabilise domains 1 and 2.

Female, specific pathogen free A/J mice (Harlan UK) were used in
this study as these are a consistent model for anthrax infection
(Welkos 1986). Mice were age matched and seven weeks of age at
the start of the study.

A/J mice were immunised on days 1 and 28 of the study with 10µg
of fusion protein adsorbed to 20% of 1.3% v/v Alhydrogel (HCI
Biosector, Denmark) in a total volume of 100µl of PBS. Groups
immunised with rPA from B. subtilis (Miller 1998), with
recombinant GST control protein, or with fusion proteins encoding
domains 1, 4 and 1-4 which had the GST tag removed, were also
included. Immunising doses were administered intramuscularly
into two sites on the hind legs. Mice were blood sampled 37 days
post primary immunisation for serum antibody analysis by enzyme
linked immunosorbent assay (ELISA).

Microtitre plates (Immulon 2, Dynex Technologies) were coated
overnight at 4°C with 5µg/ml rPA, expressed from B. subtilis
(Miller 1998), in PBS, except for two rows per plate which were
coated with 5µg/ml anti-mouse Fab (Sigma, Poole, Dorset).
Plates were washed with PBS containing 1% v/v Tween 20 (PBS-T)
and blocked with 5% w/v skimmed milk powder in PBS (blotto) for
2 hours at 37°C. Serum, double-diluted in 1% blotto, was added
to the rPA coated wells and was assayed in duplicate together
with murine IgG standard (Sigma) added to the anti-Fab coated
wells, and incubated overnight at 4°C. After washing, horseradish
peroxidase conjugated goat anti-mouse IgG (Southern
Biotechnology Associates Inc.), diluted 1 in 2000 in PBS, was
added to all wells and incubated for 1 hour at 37°C. Plates
were washed again before addition of the substrate 2,2'-azinobis
(3-ethylbenzthiazoline-sulfonic acid) (1.09mM ABTS, Sigma).
After 20 minutes incubation at room temperature, the absorbance
of the wells at 414nm was measured (Titertek Multiscan, ICN
Flow). Standard curves were calculated using Titersoft version
3.1c software. Titres were presented as µg IgG per ml serum and
group means ± standard error of the mean (sem) were calculated.
The results are shown in Figure 5.
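The group mean and standard error of the mean can be computed as in the Python sketch below. This is an illustration only: the titre values are hypothetical, the function name is an assumption, and the sem is taken here as the sample standard deviation divided by the square root of the group size, which the text does not state explicitly.

```python
import math
import statistics

def mean_and_sem(values):
    """Return the group mean and the standard error of the mean (sem)."""
    n = len(values)
    mean = statistics.mean(values)
    # Assumed definition: sem = sample standard deviation / sqrt(n)
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem

# Hypothetical titres (µg IgG per ml serum) for one group of five mice;
# these are NOT values from the study.
example_titres = [1.0, 2.0, 3.0, 4.0, 5.0]
mean, sem = mean_and_sem(example_titres)
print(f"{mean:.1f} ± {sem:.2f}")
```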

All the rPA truncates produced were immunogenic and stimulated
mean serum anti-rPA IgG concentrations in the A/J mice ranging
from 6µg per ml, for the GST1b-2 truncate immunised group, to
1488µg per ml, in the GST1-4 truncate immunised group (Figure
5). The GST control immunised mice had no detectable antibodies
to rPA.

Mice were challenged with B. anthracis STI spores on day 70 of
the immunisation regimen. Sufficient STI spores for the
challenge were removed from stock, washed in sterile distilled
water and resuspended in PBS to a concentration of 1x10^7 and
1x10^6 spores per ml. Mice were challenged intraperitoneally with
0.1ml volumes containing 1x10^6 and 1x10^5 spores per mouse,
respectively, and were monitored for 14 days post challenge to
determine their protected status. Humane end-points were
strictly observed so that any animal displaying a collection of
clinical signs which together indicated it had a lethal
infection was culled. The numbers of immunised mice which
survived 14 days post challenge are shown in Table 3.
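The challenge doses follow directly from the spore concentration and the injected volume. The Python sketch below is illustrative only; the function and constant names are assumptions, and the MLD conversion uses the approximate figure given in the footnote to Table 3 (1 MLD = approx. 1 x 10^3 STI spores).

```python
# Approximate conversion taken from the footnote to Table 3.
SPORES_PER_MLD = 1_000

def spores_per_mouse(spores_per_ml, volume_ul):
    """Spores delivered in a challenge volume given in microlitres."""
    return spores_per_ml * volume_ul // 1000

for concentration in (10**7, 10**6):
    dose = spores_per_mouse(concentration, 100)   # 0.1 ml = 100 µl
    print(f"{concentration} spores/ml -> {dose} spores/mouse "
          f"(approx. {dose // SPORES_PER_MLD} MLD)")
```

Integer arithmetic with the volume expressed in microlitres is used here only to avoid floating-point rounding in the illustration.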
Table 3

                 Challenge level in MLDs: survivors/no. challenged (%)
Domain           10^2 MLDs       10^3 MLDs
GST 1            3/5 (60)        1/5 (20)
GST 1b-2         1/5 (20)        nd
GST 1-2          5/5 (100)       3/5 (60)
GST 1b-3         3/5 (60)        nd
GST 1-3          4/5 (80)        nd
GST 1-4          nd              5/5 (100)
GST 2-4          nd              5/5 (100)
GST 3-4          nd              5/5 (100)
GST 4            5/5 (100)       5/5 (100)
GST 1 + GST 4    nd              5/5 (100)
Cleaved 1        1/5 (20)        2/5
Cleaved 4        5/5 (100)       5/5
Cleaved 1-4      nd              5/5
rPA              nd              4/4 (100)
control          0/5 (0)         0/5 (0)

1 MLD = approx. 1 x 10^3 STI spores
nd = not done
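The bracketed percentages in Table 3 are the survivors expressed as a fraction of the animals challenged. A minimal Python sketch of that conversion is given below; the function name and the choice of example rows are assumptions made for illustration.

```python
def survival_percent(entry):
    """Convert a 'survivors/challenged' string such as '3/5' to a percentage."""
    survivors, challenged = (int(part) for part in entry.split("/"))
    return round(100 * survivors / challenged)

# A few entries copied from the 10^2 MLD column of Table 3.
examples = {"GST 1": "3/5", "GST 1b-2": "1/5", "GST 1-3": "4/5", "GST 4": "5/5"}
for group, entry in examples.items():
    print(f"{group}: {entry} ({survival_percent(entry)})")
```

The same function can be applied to any entry of the table for which only the survivors/challenged ratio is listed.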

The groups challenged with 10^3 MLDs of STI spores were all
fully protected except for the GST1, GST1-2 and cleaved 1
immunised groups, in which there was some breakthrough in
protection, and the control group immunised with GST only, which
all succumbed to infection with a mean time to death (MTTD) of
2.4 ± 0.2 days. At the lower challenge level of 10^2 MLDs the
GST1-2, GST4 and cleaved 4 immunised groups were all fully
protected, but there was some breakthrough in protection in the
other groups. The mice that died in these groups had a MTTD of
4.5 ± 0.2 days, which was not significantly different from the
GST control immunised group, which all died with a MTTD of 4 ±
0.4 days.



Claims 

1. An immunogenic reagent which produces an immune response
which is protective against Bacillus anthracis, said reagent
comprising one or more polypeptides which together represent up
to three domains of the full length Protective Antigen (PA) of
B. anthracis or variants of these, and at least one of said
domains comprises domain 1 or domain 4 of PA or a variant
thereof.

2. An immunogenic reagent according to claim 1 which
comprises the sequence of domain 1 and/or domain 4 of wild-type
PA.

3. An immunogenic reagent according to claim 1 or claim 2
which comprises domain 4 of the PA of B. anthracis.

4. An immunogenic reagent according to any one of the
preceding claims which comprises a combination of domains 1 and
4 or protective regions thereof.

5. An immunogenic reagent according to claim 4 wherein said
domains are present in the form of a fusion polypeptide.

6. An immunogenic reagent according to claim 5 which
comprises domain 1 fused to domain 2 of the PA sequence.

7. An immunogenic reagent according to claim 6 which is fused
to domain 3 of the PA sequence.

8. An immunogenic reagent according to claim 4 which
comprises a mixture of polypeptides, one of which comprises
domain 1 and one of which comprises domain 4 of the PA sequence.

9. An immunogenic reagent according to any one of the
preceding claims wherein a polypeptide is fused to a further
polypeptide.

10. An immunogenic reagent according to claim 9 wherein said
further peptide is glutathione-S-transferase (GST).

11. A nucleic acid which encodes a polypeptide of an
immunogenic reagent according to any one of the preceding
claims.

12. An expression vector comprising a nucleic acid according
to claim 11.

13. A cell transformed with a vector according to claim 12.

14. A method for producing an immunogenic polypeptide which
produces an immune response which is protective against B.
anthracis, said method comprising transforming an E. coli host
with a nucleic acid which encodes either (a) the protective
antigen (PA) of Bacillus anthracis or a variant thereof which
can produce a protective immune response, or (b) a protective
domain of the protective antigen (PA) of Bacillus anthracis or a
variant thereof which can produce a protective immune response,
culturing the transformed host and recovering the polypeptide
therefrom, provided that where the polypeptide is the protective
antigen (PA) of Bacillus anthracis or a variant thereof which can
produce a protective immune response, the percentage of
guanine and cytosine residues within the said nucleic acid is
in excess of 35%.

15. A method according to claim 14 wherein the said nucleic
acid encodes the protective antigen (PA) of Bacillus anthracis
or a variant thereof which can produce a protective immune
response.

16. A method according to claim 15 wherein the percentage of
guanine and cytosine residues within the said nucleic acid is
in excess of 45%.

17. A method according to claim 16 wherein the percentage of
guanine and cytosine residues within the said nucleic acid is
from 50-52%.

18. A method according to claim 14 wherein the said nucleic
acid encodes a protective domain of the protective antigen (PA)
of Bacillus anthracis or a variant thereof which can produce a
protective immune response.

19. A method according to claim 18 wherein the domain is
domain 1 and/or domain 4 of PA of B. anthracis.

20. A recombinant Escherichia coli cell which has been
transformed with a nucleic acid which encodes the protective
antigen (PA) of Bacillus anthracis or a variant thereof which
can produce a protective immune response, and wherein the
percentage of guanine and cytosine residues within the nucleic
acid is in excess of 35%.

21. A recombinant Escherichia coli cell according to claim 20
wherein the percentage of guanine and cytosine residues within
the said nucleic acid is in excess of 45%.

22. A recombinant Escherichia coli cell according to claim 21
wherein the percentage of guanine and cytosine residues within
the said nucleic acid is from 50%-52%.

23. A recombinant E. coli cell according to claim 20 wherein
said nucleic acid is of SEQ ID NO 1 as shown in Figure 2 or a
modified form thereof.

24. A recombinant E. coli cell according to claim 23 wherein
said nucleic acid is of SEQ ID NO 1.

25. A recombinant Escherichia coli cell which has been
transformed with a nucleic acid which encodes a protective
domain of the protective antigen (PA) of Bacillus anthracis or a
variant thereof which can produce a protective immune response.

26. A recombinant cell according to claim 25 wherein the
nucleic acid encodes domain 1 or domain 4 of PA of B. anthracis.

27. A method of producing a polypeptide which produces an
immune response which is protective against B. anthracis, said
method comprising culturing a cell according to any one of
claims 20 to 26 and recovering the protective polypeptide from
the culture.

28. An E. coli transformation vector comprising a nucleic acid
which encodes the protective antigen (PA) of Bacillus anthracis
or a variant thereof which can produce a protective immune
response, and wherein the percentage of guanine and cytosine
residues within the nucleic acid is in excess of 35%.

29. An E. coli transformation vector comprising a nucleic acid
which encodes a protective domain of the protective antigen (PA)
of Bacillus anthracis or a variant thereof which can produce a
protective immune response.

30. A nucleic acid of SEQ ID NO 1 or a modified form thereof
which encodes PA or a variant thereof which produces a
protective immune response and which has at least 35% GC
content.

31. A nucleic acid according to claim 30 which is at least 90%
identical to SEQ ID NO 1.

32. A nucleic acid according to claim 31 which comprises SEQ
ID NO 1.

34. A method of preventing or treating infection by B.
anthracis, said method comprising administering to a mammal in
need thereof a sufficient amount of an immunogenic reagent
according to any one of claims 1 to 10.

35. The use of an immunogenic reagent according to any one of
claims 1 to 10 in the preparation of a medicament for the
prophylaxis or treatment of B. anthracis infection.
Escherichia coli [gbbct]: 14457 CDS's (4541860 codons)
Fields: [triplet] [frequency: per thousand] ([number])

UUU 22.0(100128)  UCU  9.3( 42367)  UAU 16.7( 75774)  UGU  5.2( 23461)
UUC 16.5( 74885)  UCC  8.9( 40365)  UAC 12.3( 55847)  UGC  6.3( 28747)
UUA 13.8( 62823)  UCA  7.9( 35837)  UAA  2.0(  9006)  UGA  1.0(  4428)
UUG 13.3( 60322)  UCG  8.7( 39546)  UAG  0.3(  1172)  UGG 14.5( 65630)

CUU 11.3( 51442)  CCU  7.2( 32678)  CAU 12.7( 57585)  CGU 20.7( 93997)
CUC 10.6( 48147)  CCC  5.4( 24383)  CAC  9.6( 43743)  CGC 21.1( 96053)
CUA  4.0( 18067)  CCA  8.5( 38663)  CAA 14.8( 67129)  CGA  3.7( 16607)
CUG 50.9(231373)  CCG 22.3(101467)  CAG 28.8(130898)  CGG  5.7( 25751)

AUU 29.9(135873)  ACU  9.5( 43256)  AAU 18.7( 84846)  AGU  9.1( 41544)
AUC 24.6(111878)  ACC 22.7(103121)  AAC 21.6( 98018)  AGC 15.6( 70867)
AUA  5.3( 24233)  ACA  7.9( 35995)  AAA 34.4(156169)  AGA  2.7( 12345)
AUG 27.2(123604)  ACG 14.0( 63696)  AAG 11.4( 51685)  AGG  1.6(  7423)

GUU 19.1( 86572)  GCU 16.2( 73677)  GAU 32.3(146794)  GGU 25.1(114185)
GUC 14.8( 67356)  GCC 25.0(113412)  GAC 19.3( 87759)  GGC 28.6(130043)
GUA 11.2( 51020)  GCA 20.6( 93390)  GAA 39.5(179460)  GGA  8.6( 39036)
GUG 25.5(115687)  GCG 32.2(146264)  GAG 18.5( 83804)  GGG 11.1( 50527)

Coding GC 51.37% 1st letter GC 58.50% 2nd letter GC 40.70% 3rd letter GC 54.90%

Bacillus anthracis [gbbct]: 130 CDS's (52031 codons)
Fields: [triplet] [frequency: per thousand] ([number])

UUU 33.5(1745)  UCU 17.3( 902)  UAU 34.4(1792)  UGU  6.1( 319)
UUC 10.2( 530)  UCC  5.3( 275)  UAC  9.4( 490)  UGC  2.1( 107)
UUA 44.2(2301)  UCA 14.0( 730)  UAA  2.3( 113)  UGA  0.5(  24)
UUG 11.3( 589)  UCG  3.6( 188)  UAG  0.7(  37)  UGG  9.8( 511)

CUU 14.7( 763)  CCU 10.1( 525)  CAU 16.8( 873)  CGU 10.9( 567)
CUC  3.7( 195)  CCC  2.7( 141)  CAC  4.6( 239)  CGC  2.6( 137)
CUA 13.2( 686)  CCA 14.9( 773)  CAA 33.7(1752)  CGA  6.8( 353)
CUG  4.7( 242)  CCG  4.6( 237)  CAG 10.4( 542)  CGG  1.8(  95)

AUU 44.6(2322)  ACU 14.6( 761)  AAU 44.6(2321)  AGU 16.5( 861)
AUC 11.8( 616)  ACC  5.2( 269)  AAC 13.7( 711)  AGC  5.1( 266)
AUA 24.9(1295)  ACA 25.9(1350)  AAA 69.5(3614)  AGA 13.8( 720)
AUG 23.8(1240)  ACG  8.1( 419)  AAG 23.5(1223)  AGG  4.3( 226)

GUU 19.9(1036)  GCU 17.9( 930)  GAU 39.7(2068)  GGU 17.3( 900)
GUC  5.2( 263)  GCC  4.7( 244)  GAC  8.8( 456)  GGC  5.4( 279)
GUA 26.8(1395)  GCA 22.6(1178)  GAA 55.7(2897)  GGA 20.2(1049)
GUG  9.7( 507)  GCG  7.1( 368)  GAG 19.3(1003)  GGG  8.9( 461)

Coding GC 33.59% 1st letter GC 44.51% 2nd letter GC 31.07% 3rd letter GC 25.20%
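The per-thousand frequencies in the two codon usage tables are the codon counts divided by the total number of codons. The Python sketch below illustrates this for the leucine codon CUG, whose usage differs markedly between the two organisms; the function and constant names are assumptions made for illustration.

```python
def per_thousand(count, total_codons):
    """Codon frequency per thousand codons, rounded to one decimal place."""
    return round(1000 * count / total_codons, 1)

ECOLI_TOTAL = 4_541_860      # total codons in the E. coli table
BANTHRACIS_TOTAL = 52_031    # total codons in the B. anthracis table

# CUG (leucine) counts taken from the two tables.
print("E. coli CUG:     ", per_thousand(231_373, ECOLI_TOTAL))
print("B. anthracis CUG:", per_thousand(242, BANTHRACIS_TOTAL))
```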
1 AAGCTTCATA TGGAAGTAAA GCAAGAGAAC CGTCTGCTGA ACGAATCTGA ATCCAGCTCT
61 CAGGGCCTGC TTGGTTACTA TTTCTCTGAC CTGAACTTCC AAGCACCGAT GGTTGTAACC 
121 AGCTCTACCA CTGGCGATCT GTCCATCCCG TCTAGTGAAC TTGAGAACAT TCCAAGCGAG 
181 AACCAGTATT TCCAGTCTGC AATCTGGTCC GGTTTTATCA AAGTCAAGAA ATCTGATGAA 
241 TACACGTTTG CCACCTCTGC TGATAACCAC GTAACCATGT GGGTTGACGA TCAGGAAGTG 
301 ATCAACAAAG CATCCAACTC CAACAAAATT CGTCTGGAAA AAGGCCGTCT GTATCAGATC 
361 AAGATTCAGT ACCAACGCGA GAACCCGACT GAAAAAGGCC TGGACTTTAA ACTGTATTGG 
421 ACTGATTCTC AGAACAAGAA AGAAGTGATC AGCTCTGACA ATCTGCAACT GCCGGAATTG 
481 AAACAGAAAA GCTCCAACTC TCGTAAGAAA CGTTCCACCA GCGCTGGCCC GACCGTACCA 
541 GATCGCGACA ACGATGGTAT TCCGGACTCT CTGGAAGTTG AAGGCTACAC GGTTGATGTA 
601 AAGAACAAAC GTACCTTCCT TAGTCCGTGG ATCTCCAATA TTCACGAGAA GAAAGGTCTG 
661 ACCAAATACA AATCCAGTCC GGAAAAATGG TCCACTGCAT CTGATCCGTA CTCTGACTTT 
721 GAGAAAGTGA CCGGTCGTAT CGACAAGAAC GTCTCTCCGG AAGCACGCCA TCCACTGGTT 
781 GCTGCGTATC CGATCGTACA TGTTGACATG GAAAACATCA TTTTGTCCAA GAACGAAGAC 
841 CAGTCCACTC AGAACACTGA CTCTGAAACT CGTACCATCT CCAAGAACAC CTCCACGTCT 
901 CGTACTCACA CCAGTGAAGT ACATGGTAAC GCTGAAGTAC ACGCCTCTTT CTTTGACATC 
961 GGCGGCTCTG TTAGCGCTGG CTTCTCCAAC TCTAATTCTT CTACTGTTGC CATTGATCAC 
1021 TCTCTGAGTC TGGCTGGCGA ACGTACCTGG GCAGAGACCA TGGGTCTTAA CACTGCTGAT 
1081 ACCGCGCGTC TGAATGCTAA CATTCGCTAC GTCAACACTG GTACGGCACC GATCTACAAC 
1141 GTACTGCCAA CCACCAGCCT GGTTCTGGGT AAGAACCAGA CTCTTGCGAC CATCAAAGCC 
1201 AAAGAGAACC AACTGTCTCA GATTCTGGCA CCGAATAACT ACTATCCTTC CAAGAACCTG 
1261 GCTCCGATCG CACTGAACGC ACAGGATGAC TTCTCTTCCA CTCCGATCAC CATGAACTAC 
1321 AACCAGTTCC TGGAACTTGA GAAGACCAAA CAGCTGCGTC TTGACACTGA CCAAGTGTAC 
1381 GGTAACATCG CGACCTACAA CTTTGAGAAC GGTCGCGTCC GCGTTGACAC AGGCTCTAAT 
1441 TGGTCTGAAG TACTGCCTCA GATTCAGGAA ACCACCGCTC GTATCATCTT CAACGGTAAA 
1501 GACCTGAACC TGGTTGAACG TCGTATTGCT GCTGTGAACC CGTCTGATCC ATTAGAGACC 
1561 ACCAAACCGG ATATGACTCT GAAAGAAGCC CTGAAGATCG CCTTTGGCTT CAACGAGCCG 
1621 AACGGTAATC TTCAGTACCA AGGTAAAGAC ATCACTGAAT TTGACTTCAA CTTTGATCAG 
1681 CAGACCTCTC AGAATATCAA GAACCAACTG GCTGAGCTGA ACGCGACCAA TATCTATACG 
1741 GTACTCGACA AGATCAAACT GAACGCGAAA ATGAACATTC TGATTCGCGA CAAACGTTTC 
1801 CACTACGATC GTAATAACAT CGCTGTTGGC GCTGATGAAT CTGTTGTGAA AGAAGCGGAT 
1861 CGCGAAGTCA TCAACTCCAG CACCGAAGGC CTGCTTCTGA ACATCGACAA AGACATTCGT 
1921 AAGATCCTGT CTGGTTACAT TGTTGAGATC GAAGACACCG AAGGCCTGAA AGAAGTGATC 
1981 AATGATCGTT ACGACATGCT GAACATCAGC TCTCTGCGTC AAGATGGTAA GACGTTCATT 
2041 GACTTCAAGA AATACAACGA CAAACTTCCG CTGTATATCT CTAATCCGAA CTACAAAGTG 
2101 AACGTTTACG CTGTTACCAA AGAGAACACC ATCATCAATC CATCTGAGAA CGGCGATACC 
2161 TCTACCAACG GTATCAAGAA GATTCTGATC TTCTCCAAGA AAGGTTACGA GATCGGTTAA 
2221 TAGGATCC 
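The claims refer to the percentage of G and C residues in the nucleic acid. A minimal Python sketch of such a calculation is given below; the function name is an assumption, and the 60-base fragment is used only to keep the example short, so its value should not be read as the GC content of the full-length sequence.

```python
def gc_percent(sequence):
    """Percentage of G and C bases in a nucleotide sequence.

    Whitespace and position numbers are ignored; only A, C, G, T and U
    characters are counted.
    """
    bases = [b for b in sequence.upper() if b in "ACGTU"]
    if not bases:
        raise ValueError("sequence contains no nucleotides")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)

# First 60 bases of the sequence listed above.
fragment = ("AAGCTTCATA TGGAAGTAAA GCAAGAGAAC "
            "CGTCTGCTGA ACGAATCTGA ATCCAGCTCT")
print(f"{gc_percent(fragment):.1f}% GC")
```

Applied to a complete coding sequence, the result can be compared against thresholds such as those recited in the claims.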
1 EVKQENRLLN ESESSSQGLL GYYFSDLNFQ APMVVTSSTT GDLSIPSSEL ENIPSENQYF
61 QSAIWSGFIK VKKSDEYTFA TSADNHVTMW VDDQEVINKA SNSNKIRLEK GRLYQIKIQY
121 QRENPTEKGL DFKLYWTDSQ NKKEVISSDN LQLPELKQKS SNSRKKRSTS AGPTVPDRDN
181 DGIPDSLEVE GYTVDVKNKR TFLSPWISNI HEKKGLTKYK SSPEKWSTAS DPYSDFEKVT
241 GRIDKNVSPE ARHPLVAA

(SEQ ID No 3)



1 gaagttaaac aggagaaccg gttattaaat gaatcagaat caagttccca ggggttacta 
61 ggatactatt ttagtgattt gaattttcaa gcacccatgg tggttacctc ttctactaca 
121 ggggatttat ctattcctag ttctgagtta gaaaatattc catcggaaaa ccaatatttt 
181 caatctgcta tttggtcagg atttatcaaa gttaagaaga gtgatgaata tacatttgct 
241 acttccgctg ataatcatgt aacaatgtgg gtagatgacc aagaagtgat taataaagct 
301 tctaattcta acaaaatcag attagaaaaa ggaagattat atcaaataaa aattcaatat 
361 caacgagaaa atcctactga aaaaggattg gatttcaagt tgtactggac cgattctcaa 
421 aataaaaaag aagtgatttc tagtgataac ttacaattgc cagaattaaa acaaaaatct 
481 tcgaactcaa gaaaaaagcg aagtacaagt gctggaccta cggttccaga ccgtgacaat 
541 gatggaatcc ctgattcatt agaggtagaa ggatatacgg ttgatgtcaa aaataaaaga 
601 acttttcttt caccatggat ttctaatatt catgaaaaga aaggattaac caaatataaa 
661 tcatctcctg aaaaatggag cacggcttct gatccgtaca gtgatttcga aaaggttaca 
721 ggacggattg ataagaatgt atcaccagag gcaagacacc cccttgtggc agct 

(SEQ ID No 4)
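The correspondence between a DNA sequence and the amino acid sequence it encodes can be checked by translating with the standard genetic code. The Python sketch below is illustrative only: the names are assumptions, the first input is the first 30 bases of the DNA sequence listed above, and the second is a 30-base stretch of the synthetic sequence beginning at its ATG start codon.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(dna):
    """Translate a DNA string (whitespace ignored) in reading frame 1."""
    dna = "".join(dna.upper().split())
    return "".join(CODON_TABLE[dna[i:i + 3]]
                   for i in range(0, len(dna) - 2, 3))

native = "gaagttaaac aggagaaccg gttattaaat"      # native coding sequence
synthetic = "ATGGAAGTAAAGCAAGAGAACCGTCTGCTG"     # synthetic gene, from the ATG

print(translate(native))     # EVKQENRLLN
print(translate(synthetic))  # MEVKQENRLL
```

Although the two DNA fragments use different codons, they encode the same residues, with the synthetic fragment carrying the additional initiator methionine.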



1 EVKQENRLLN ESESSSQGLL GYYFSDLNFQ APMVVTSSTT GDLSIPSSEL ENIPSENQYF
61 QSAIWSGFIK VKKSDEYTFA TSADNHVTMW VDDQEVINKA SNSNKIRLEK GRLYQIKIQY 
121 QRENPTEKGL DFKLYWTDSQ NKKEVISSDN LQLPELKQKS SNSRKKRSTS AGPTVPDRDN 
181 DGIPDSLEVE GYTVDVKNKR TFLSPWISNI HEKKGLTKYK SSPEKWSTAS DPYSDFEKVT 
241 GRIDKNVSPE ARHPLVAAYP IVHVDMENII LSKNEDQSTQ NTDSETRTIS KNTSTSRTHT 
301 SEVHGNAEVH ASFFDIGGSV SAGFSNSNSS TVAIDHSLSL AGERTWAETM GLNTADTARL 
361 NANIRYVNTG TAPIYNVLPT TSLVLGKNQT LATIKAKENQ LSQILAPNNY YPSKNLAPIA 
421 LNAQDDFSST PITMNYNQFL ELEKTKQLRL DTDQVYGNIA TYNFENGRVR VDTGSNWSEV 
481 LPQIQET 

(SEQ ID No 5) 
1 gaagttaaac aggagaaccg gttattaaat gaatcagaat caagttccca ggggttacta
61 ggatactatt ttagtgattt gaattttcaa gcacccatgg tggttacttc ttctactaca 
121 ggggatttat ctattcctag ttctgagtta gaaaatattc catcggaaaa ccaatatttt 
181 caatctgcta tttggtcagg atttatcaaa gttaagaaga gtgatgaata tacatttgct 
241 acttccgctg ataatcatgt aacaatgtgg gtagatgacc aagaagtgat taataaagct 
301 tctaattcta acaaaatcag attagaaaaa ggaagattat atcaaataaa aattcaatat 
361 caacgagaaa atcctactga aaaaggattg gatttcaagt tgtactggac cgattctcaa 
421 aataaaaaag aagtgatttc tagtgataac ttacaactgc cagaattaaa acaaaaatct 
481 tcgaactcaa gaaaaaagcg aagtacaagt gctggaccta cggttccaga ccgtgacaat 
541 gatggaatcc ctgattcatt agaggtagaa ggatatacgg ttgatgtcaa aaataaaaga 
601 acttttcttt caccatggat ttctaatatt catgaaaaga aaggattaac caaatataaa 
661 tcatctcctg aaaaatggag cacggcttct gatccgtaca gtgatttcga aaaggttaca 
721 ggacggattg ataagaatgt atcaccagag gcaagacacc cccttgtggc agcttatccg 
781 attgtacatg tagatatgga gaatattatt ctctcaaaaa atgaggatca atccacacag 
841 aatactgata gtgaaacgag aacaataagt aaaaatactt ctacaagtag gacacatact 
901 agtgaagtac atggaaatgc agaagtgcat gcgtcgttct ttgatattgg tgggagtgta 
961 tctgcaggat ttagtaattc gaattcaagt acggtcgcaa ttgatcattc actatctcta 
1021 gcaggggaaa gaacttgggc tgaaacaatg ggtttaaata ccgctgatac agcaagatta 
1081 aatgccaata ttagatatgt aaatactggg acggctccaa tctacaacgt gttaccaacg 
1141 acttcgttag tgttaggaaa aaatcaaaca ctcgcgacaa ttaaagctaa ggaaaaccaa 
1201 ttaagtcaaa tacttgcacc taataattat tatccttcta aaaacttggc gccaatcgca 
1261 ttaaatgcac aagacgattt cagttctact ccaattacaa tgaattacaa tcaatttctt 
1321 gagttagaaa aaacgaaaca attaagatta gatacggatc aagtatatgg gaatatagca 
1381 acatacaatt ttgaaaatgg aagagtgagg gtggatacag gctcgaactg gagtgaagtg 
1441 ttaccgcaaa ttcaagaaac a 



(SEQ ID No 6) 



1 SAGPTVPDRD NDGIPDSLEV EGYTVDVKNK RTFLSPWISN IHEKKGLTKY KSSPEKWSTA 
61 SDPYSDFEKV TGRIDKNVSP EARHPLVAAY PIVHVDMENI ILSKNEDQST QNTDSETRTI 
121 SKNTSTSRTH TSEVHGNAEV HASFFDIGGS VSAGFSNSNS STVAIDHSLS LAGERTWAET 
181 MGLNTADTAR LNANIRYVNT GTAPIYNVLP TTSLVLGKNQ TLATIKAKEN QLSQILAPNN 
241 YYPSKNLAPI ALNAQDDFSS TPITMNYNQF LELEKTKQLR LDTDQVYGNI ATYNFENGRV
301 RVDTGSNWSE VLPQIQET 



(SEQ ID NO 7) 
1 agtgctggac ctacggttcc agaccgtgac aatgatggaa tccctgattc attagaggta
61 gaaggatata cggttgatgt caaaaataaa agaacttttc tttcaccatg gatttctaat
121 attcatgaaa agaaaggatt aaccaaatat aaatcatctc ctgaaaaatg gagcacggct
181 tctgatccgt acagtgattt cgaaaaggtt acaggacgga ttgataagaa tgtatcacca
241 gaggcaagac acccccttgt ggcagcttat ccgattgtac atgtagatat ggagaatatt
301 attctctcaa aaaatgagga tcaatccaca cagaatactg atagtgaaac gagaacaata
361 agtaaaaata cttctacaag taggacacat actagtgaag tacatggaaa tgcagaagtg
421 catgcgtcgt tctttgatat tggtgggagt gtatctgcag gatttagtaa ttcgaattca
481 agtacggtcg caattgatca ttcactatct ctagcagggg aaagaacttg ggctgaaaca
541 atgggtttaa ataccgctga tacagcaaga ttaaatgcca atattagata tgtaaatact
601 gggacggctc caatctacaa cgtgttacca acgacttcgt tagtgttagg aaaaaatcaa
661 acactcgcga caattaaagc taaggaaaac caattaagtc aaatacttgc acctaataat
721 tattatcctt ctaaaaactt ggcgccaatc gcattaaatg cacaagacga tttcagttct
781 actccaatta caatgaatta caatcaattt cttgagttag aaaaaacgaa acaattaaga
841 ttagatacgg atcaagtata tgggaatata gcaacataca attttgaaaa tggaagagtg
901 agggtggata caggctcgaa ctggagtgaa gtgttaccgc aaattcaaga aaca

(SEQ ID No 8)



1 SAGPTVPDRD NDGIPDSLEV EGYTVDVKNK RTFLSPWISN IHEKKGLTKY KSSPEKWSTA 
61 SDPYSDFEKV TGRIDKNVSP EARHPLVAAY PIVHVDMENI ILSKNEDQST QNTDSETRTI 
121 SKNTSTSRTH TSEVHGNAEV HASFFDIGGS VSAGFSNSNS STVAIDHSLS LAGERTWAET 
181 MGLNTADTAR LNANIRYVNT GTAPIYNVLP TTSLVLGKNQ TLATIKAKEN QLSQILAPNN 
241 YYPSKNLAPI ALNAQDDFSS TPITMNYNQF LELEKTKQLR LDTDQVYGNI ATYNFENGRV 
301 RVDTGSNWSE VLPQIQETTA RIIFNGKDLN LVERRIAAVN PSDPLETTKP DMTLKEALKI 
361 AFGFNEPNGN LQYQGKDITE FDFNFDQQTS QNIKNQLAEL NATNIYTVLD KIKLNAKMNI
421 LIRDKR

(SEQ ID No 9)
1 agtgctggac ctacggttcc agaccgtgac aatgatggaa tccctgattc attagaggta
61 gaaggatata cggttgatgt caaaaataaa agaacttttc tttcaccatg gatttctaat
121 attcatgaaa agaaaggatt aaccaaatat aaatcatctc ctgaaaaatg gagcacggct
181 tctgatccgt acagtgattt cgaaaaggtt acaggacgga ttgataagaa tgtatcacca
241 gaggcaagac acccccttgt ggcagcttat ccgattgtac atgtagatat ggagaatatt
301 attctctcaa aaaatgagga tcaatccaca cagaatactg atagtgaaac gagaacaata
361 agtaaaaata cttctacaag taggacacat actagtgaag tacatggaaa tgcagaagtg
421 catgcgtcgt tctttgatat tggtgggagt gtatctgcag gatttagtaa ttcgaattca
481 agtacggtcg caattgatca ttcactatct ctagcagggg aaagaacttg ggctgaaaca
541 atgggtttaa ataccgctga tacagcaaga ttaaatgcca atattagata tgtaaatact
601 gggacggctc caatctacaa cgtgttacca acgacttcgt tagtgttagg aaaaaatcaa
661 acactcgcga caattaaagc taaggaaaac caattaagtc aaatacttgc acctaataat
721 tattatcctt ctaaaaactt ggcgccaatc gcattaaatg cacaagacga tttcagttct
781 actccaatta caatgaatta caatcaattt cttgagttag aaaaaacgaa acaattaaga
841 ttagatacgg atcaagtata tgggaatata gcaacataca attttgaaaa tggaagagtg
901 agggtggata caggctcgaa ctggagtgaa gtgttaccgc aaattcaaga aacaactgca
961 cgtatcattt ttaatggaaa agatttaaat ctggtagaaa ggcggatagc ggcggttaat
1021 cctagtgatc cattagaaac gactaaaccg gatatgacat taaaagaagc ccttaaaata
1081 gcatttggat ttaacgaacc gaatggaaac ttacaatatc aagggaaaga cataaccgaa
1141 tttgatttta atttcgatca acaaacatct caaaatatca agaatcagtt agcggaatta
1201 aacgcaacta acatatatac tgtattagat aaaatcaaat taaatgcaaa aatgaatatt
1261 ttaataagag ataaacgt

(SEQ ID No 10)



1 EVKQENRLLN ESESSSQGLL GYYFSDLNFQ APMVVTSSTT GDLSIPSSEL ENIPSENQYF
61 QSAIWSGFIK VKKSDEYTFA TSADNHVTMW VDDQEVINKA SNSNKIRLEK GRLYQIKIQY
121 QRENPTEKGL DFKLYWTDSQ NKKEVISSDN LQLPELKQKS SNSRKKRSTS AGPTVPDRDN
181 DGIPDSLEVE GYTVDVKNKR TFLSPWISNI HEKKGLTKYK SSPEKWSTAS DPYSDFEKVT
241 GRIDKNVSPE ARHPLVAAYP IVHVDMENII LSKNEDQSTQ NTDSETRTIS KNTSTSRTHT
301 SEVHGNAEVH ASFFDIGGSV SAGFSNSNSS TVAIDHSLSL AGERTWAETM GLNTADTARL
361 NANIRYVNTG TAPIYNVLPT TSLVLGKNQT LATIKAKENQ LSQILAPNNY YPSKNLAPIA 
421 LNAQDDFSST PITMNYNQFL ELEKTKQLRL DTDQVYGNIA TYNFENGRVR VDTGSNWSEV 
481 LPQIQETTAR IIFNGKDLNL VERRIAAVNP SDPLETTKPD MTLKEALKIA FGFNEPNGNL 
541 QYQGKDITEF DFNFDQQTSQ NIKNQLAELN ATNIYTVLDK IKLNAKMNIL IRDKR 

(SEQ ID No 11) 
1 gaagttaaac aggagaaccg gttattaaat gaatcagaat caagttccca ggggttacta
61 ggatactatt ttagtgattt gaattttcaa gcacccatgg tggttacctc ttctactaca
121 ggggatttat ctattcctag ttctgagtta gaaaatattc catcggaaaa ccaatatttt
181 caatctgcta tttggtcagg atttatcaaa gttaagaaga gtgatgaata tacatttgct
241 acttccgctg ataatcatgt aacaatgtgg gtagatgacc aagaagtgat taataaagct
301 tctaattcta acaaaatcag attagaaaaa ggaagattat atcaaataaa aattcaatat
361 caacgagaaa atcctactga aaaaggattg gatttcaagt tgtactggac cgattctcaa
421 aataaaaaag aagtgatttc tagtgataac ttacaattgc cagaattaaa acaaaaatct
481 tcgaactcaa gaaaaaagcg aagtacaagt gctggaccta cggttccaga ccgtgacaat
541 gatggaatcc ctgattcatt agaggtagaa ggatatacgg ttgatgtcaa aaataaaaga
601 acttttcttt caccatggat ttctaatatt catgaaaaga aaggattaac caaatataaa
661 tcatctcctg aaaaatggag cacggcttct gatccgtaca gtgatttcga aaaggttaca
721 ggacggattg ataagaatgt atcaccagag gcaagacacc cccttgtggc agcttatccg
781 attgtacatg tagatatgga gaatattatt ctctcaaaaa atgaggatca atccacacag
841 aatactgata gtgaaacgag aacaataagt aaaaatactt ctacaagtag gacacatact
901 agtgaagtac atggaaatgc agaagtgcat gcgtcgttct ttgatattgg tgggagtgta
961 tctgcaggat ttagtaattc gaattcaagt acggtcgcaa ttgatcattc actatctcta
1021 gcaggggaaa gaacttgggc tgaaacaatg ggtttaaata ccgctgatac agcaagatta
1081 aatgccaata ttagatatgt aaatactggg acggctccaa tctacaacgt gttaccaacg
1141 acttcgttag tgttaggaaa aaatcaaaca ctcgcgacaa ttaaagctaa ggaaaaccaa
1201 ttaagtcaaa tacttgcacc taataattat tatccttcta aaaacttggc gccaatcgca
1261 ttaaatgcac aagacgattt cagttctact ccaattacaa tgaattacaa tcaatttctt
1321 gagttagaaa aaacgaaaca attaagatta gatacggatc aagtatatgg gaatatagca
1381 acatacaatt ttgaaaatgg aagagtgagg gtggatacag gctcgaactg gagtgaagtg
1441 ttaccgcaaa ttcaagaaac aactgcacgt atcattttta atggaaaaga tttaaatctg
1501 gtagaaaggc ggatagcggc ggttaatcct agtgatccat tagaaacgac taaaccggat
1561 atgacattaa aagaagccct taaaatagca tttggattta acgaaccgaa tggaaactta
1621 caatatcaag ggaaagacat aaccgaattt gattttaatt tcgatcaaca aacatctcaa
1681 aatatcaaga atcagttagc ggaattaaac gcaactaaca tatatactgt attagataaa
1741 atcaaattaa atgcaaaaat gaatatttta ataagagata aacgt

(SEQ ID No 12)



1 EVKQENRLLN ESESSSQGLL GYYFSDLNFQ APMVVTSSTT GDLSIPSSEL ENIPSENQYF 
61 QSAIWSGFIK VKKSDEYTFA TSADNHVTMW VDDQEVINKA SNSNKIRLEK GRLYQIKIQY 
121 QRENPTEKGL DFKLYWTDSQ NKKEVISSDN LQLPELKQKS SNSRKKRSTS AGPTVPDRDN 
181 DGIPDSLEVE GYTVDVKNKR TFLSPWISNI HEKKGLTKYK SSPEKWSTAS DPYSDFEKVT 
241 GRIDKNVSPE ARHPLVAAYP IVHVDMENII LSKNEDQSTQ NTDSETRTIS KNTSTSRTHT 
301 SEVHGNAEVH ASFFDIGGSV SAGFSNSNSS TVAIDHSLSL AGERTWAETM GLNTADTARL 
361 NANIRYVNTG TAPIYNVLPT TSLVLGKNQT LATIKAKENQ LSQILAPNNY YPSKNLAPIA 
421 LNAQDDFSST PITMNYNQFL ELEKTKQLRL DTDQVYGNIA TYNFENGRVR VDTGSNWSEV 
481 LPQIQETTAR IIFNGKDLNL VERRIAAVNP SDPLETTKPD MTLKEALKIA FGFNEPNGNL 
541 QYQGKDITEF DFNFDQQTSQ NIKNQLAELN ATNIYTVLDK IKLNAKMNIL IRDKRFHYDR 
601 NNIAVGADES VVKEAHREVI NSSTEGLLLN IDKDIRKILS GYIVEIEDTE GLKEVINDRY 
661 DMLNISSLRQ DGKTFIDFKK YNDKLPLYIS NPNYKVNVYA VTKENTIINP SENGDTSTNG 
721 IKKILIFSKK GYEIG 

(SEQ ID No 13) 
1 gaagttaaac aggagaaccg gttattaaat gaatcagaat caagttccca ggggttacta 
61 ggatactatt ttagtgattt gaattttcaa gcacccatgg tggttacctc ttctactaca 
121 ggggatttat ctattcctag ttctgagtta gaaaatattc catcggaaaa ccaatatttt 
181 caatctgcta tttggtcagg atttatcaaa gttaagaaga gtgatgaata tacatttgct 
241 acttccgctg ataatcatgt aacaatgtgg gtagatgacc aagaagtgat taataaagct 
301 tctaattcta acaaaatcag attagaaaaa ggaagattat atcaaataaa aattcaatat 
361 caacgagaaa atcctactga aaaaggattg gatttcaagt tgtactggac cgattctcaa 
421 aataaaaaag aagtgatttc tagtgataac ttacaattgc cagaattaaa acaaaaatct 
481 tcgaactcaa gaaaaaagcg aagtacaagt gctggaccta cggttccaga ccgtgacaat 
541 gatggaatcc ctgattcatt agaggtagaa ggatatacgg ttgatgtcaa aaataaaaga 
601 acttttcttt caccatggat ttctaatatt catgaaaaga aaggattaac caaatataaa 
661 tcatctcctg aaaaatggag cacggcttct gatccgtaca gtgatttcga aaaggttaca 
721 ggacggattg ataagaatgt atcaccagag gcaagacacc cccttgtggc agcttatccg 
781 attgtacatg tagatatgga gaatattatt ctctcaaaaa atgaggatca atccacacag 
841 aatactgata gtgaaacgag aacaataagt aaaaatactt ctacaagtag gacacatact 
901 agtgaagtac atggaaatgc agaagtgcat gcgtcgttct ttgatattgg tgggagtgta 
961 tctgcaggat ttagtaattc gaattcaagt acggtcgcaa ttgatcattc actatctcta 
1021 gcaggggaaa gaacttgggc tgaaacaatg ggtttaaata ccgctgatac agcaagatta 
1081 aatgccaata ttagatatgt aaatactggg acggctccaa tctacaacgt gttaccaacg 
1141 acttcgttag tgttaggaaa aaatcaaaca ctcgcgacaa ttaaagctaa ggaaaaccaa 
1201 ttaagtcaaa tacttgcacc taataattat tatccttcta aaaacttggc gccaatcgca 
1261 ttaaatgcac aagacgattt cagttctact ccaattacaa tgaattacaa tcaatttctt 
1321 gagttagaaa aaacgaaaca attaagatta gatacggatc aagtatatgg gaatatagca 
1381 acatacaatt ttgaaaatgg aagagtgagg gtggatacag gctcgaactg gagtgaagtg 
1441 ttaccgcaaa ttcaagaaac aactgcacgt atcattttta atggaaaaga tttaaatctg 
1501 gtagaaaggc ggatagcggc ggttaatcct agtgatccat tagaaacgac taaaccggat 
1561 atgacattaa aagaagccct taaaatagca tttggattta acgaaccgaa tggaaactta 
1621 caatatcaag ggaaagacat aaccgaattt gattttaatt tcgatcaaca aacatctcaa 
1681 aatatcaaga atcagttagc ggaattaaac gcaactaaca tatatactgt attagataaa 
1741 atcaaattaa atgcaaaaat gaatatttta ataagagata aacgttttca ttatgataga 
1801 aataacatag cagttggggc ggatgagtca gtagttaagg aggctcatag agaagtaatt 
1861 aattcgtcaa cagagggatt attgttaaat attgataagg atataagaaa aatattatca 
1921 ggttatattg tagaaattga agatactgaa gggcttaaag aagttataaa tgacagatat 
1981 gatatgttga atatttctag tttacggcaa gatggaaaaa catttataga ttttaaaaaa 
2041 tataatgata aattaccgtt atatataagt aatcccaatt ataaggtaaa tgtatatgct 
2101 gttactaaag aaaacactat tattaatcct agtgagaatg gggatactag taccaacggg 
2161 atcaagaaaa ttttaatctt ttctaaaaaa ggctatgaga taggataa 

(SEQ ID No 14) 



Figure 3 Cont. 
1 FHYDRNNIAV GADESVVKEA HREVINSSTE GLLLNIDKDI RKILSGYIVE IEDTEGLKEV 
61 INDRYDMLNI SSLRQDGKTF IDFKKYNDKL PLYISNPNYK VNVYAVTKEN TIINPSENGD 
121 TSTNGIKKIL IFSKKGYEIG 

(SEQ ID No 15) 



1 tttcattatg atagaaataa catagcagtt ggggcggatg agtcagtagt taaggaggct 

61 catagagaag taattaattc gtcaacagag ggattattgt taaatattga taaggatata 

121 agaaaaatat tatcaggtta tattgtagaa attgaagata ctgaagggct taaagaagtt 

181 ataaatgaca gatatgatat gttgaatatt tctagtttac ggcaagatgg aaaaacattt 

241 atagatttta aaaaatataa tgataaatta ccgttatata taagtaatcc caattataag 

301 gtaaatgtat atgctgttac taaagaaaac actattatta atcctagtga gaatggggat 

361 actagtacca acgggatcaa gaaaatttta atcttttcta aaaaaggcta tgagatagga 

421 taa 

(SEQ ID No 16) 



Figure 4 